Maximum likelihood pedigree reconstruction using integer linear programming.
Abstract
Large population biobanks of unrelated individuals have been highly successful in detecting common genetic variants affecting diseases of public health concern. However, they lack the statistical power to detect more modest gene-gene and gene-environment interaction effects or the effects of rare variants for which related individuals are ideally required. In reality, most large population studies will undoubtedly contain sets of undeclared relatives, or pedigrees. Although a crude measure of relatedness might sometimes suffice, having a good estimate of the true pedigree would be much more informative if this could be obtained efficiently. Relatives are more likely to share longer haplotypes around disease susceptibility loci and are hence biologically more informative for rare variants than unrelated cases and controls. Distant relatives are arguably more useful for detecting variants with small effects because they are less likely to share masking environmental effects. Moreover, the identification of relatives enables appropriate adjustments of statistical analyses that typically assume unrelatedness. We propose to exploit an integer linear programming optimisation approach to pedigree learning, which is adapted to find valid pedigrees by imposing appropriate constraints. Our method is not restricted to small pedigrees and is guaranteed to return a maximum likelihood pedigree. With additional constraints, we can also search for multiple high-probability pedigrees and thus account for the inherent uncertainty in any particular pedigree reconstruction. The true pedigree is found very quickly by comparison with other methods when all individuals are observed. Extensions to more complex problems seem feasible.
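The kind of formulation described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the exact model from the paper: $x_{v,W}$ is a hypothetical binary variable equal to 1 when individual $v$ is assigned parent set $W$ (at most two parents, possibly none), and $\ell(v, W)$ is an assumed notation for the log-likelihood of $v$'s genotype given the genotypes of the individuals in $W$. The "cluster" form of the acyclicity constraint shown here is one known option from Bayesian network structure learning and may differ from the constraints the authors use.

```latex
\begin{align}
\max_{x} \quad & \sum_{v \in V} \; \sum_{\substack{W \subseteq V \setminus \{v\} \\ |W| \le 2}} \ell(v, W)\, x_{v,W} \\
\text{s.t.} \quad & \sum_{W} x_{v,W} = 1 && \forall v \in V \\
& \sum_{v \in C} \; \sum_{W :\, W \cap C = \emptyset} x_{v,W} \ge 1 && \forall C \subseteq V,\ C \neq \emptyset \\
& x_{v,W} \in \{0, 1\} && \forall v, W
\end{align}
```

The first constraint assigns each individual exactly one parent set (the empty set for founders). The second rules out cycles, because every non-empty group of individuals must contain at least one member with no parent inside the group. A valid pedigree would additionally require that the two parents in any chosen $W$ be of opposite sex, which is stated here only in words. Adding a linear constraint that excludes an optimum already found and then re-solving would be one way to enumerate further high-likelihood pedigrees.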
Related articles
Maximum likelihood pedigree reconstruction using integer programming
Pedigrees are ‘family trees’ relating groups of individuals which can usefully be seen as Bayesian networks. The problem of finding a maximum likelihood pedigree from genotypic data is encoded as an integer linear programming problem. Two methods of ensuring that pedigrees are acyclic are considered. Results on obtaining maximum likelihood pedigrees relating 20, 46 and 59 individuals a...
Improved maximum likelihood reconstruction of complex multi-generational pedigrees.
The reconstruction of pedigrees from genetic marker data is relevant to a wide range of applications. Likelihood-based approaches aim to find the pedigree structure that gives the highest probability to the observed data. Existing methods either entail an exhaustive search and are hence restricted to small numbers of individuals, or they take a more heuristic approach and deliver a solution tha...
KINGROUP: a program for pedigree relationship reconstruction and kin group assignments using genetic markers
KINGROUP is an open source Java program implementing a maximum likelihood approach to pedigree relationship reconstruction and kin group assignment. KINGROUP implements a new method (currently being performance tested) for reconstructing groups of kin that share a common relationship by estimating an overall likelihood for alternative partitions. A number of features found in KINSHIP (Goodnigh...
Computing the Minimum Recombinant Haplotype Configuration from Incomplete Genotype Data on a Pedigree by Integer Linear Programming
We study the problem of reconstructing haplotype configurations from genotypes on pedigree data with missing alleles under the Mendelian law of inheritance and the minimum-recombination principle, which is important for the construction of haplotype maps and genetic linkage/association analyses. Our previous results show that the problem of finding a minimum-recombinant haplotype configuration ...
Solving Single Machine Sequencing to Minimize Maximum Lateness Problem Using Mixed Integer Programming
Despite the existence of various integer programming models for sequencing problems, there is not enough information about the practical value of these models. This paper considers the problem of minimizing maximum lateness with release dates and presents four different mixed integer programming (MIP) models to solve this problem. These models have been formulated for the classical single machine problem, namely sequ...
Journal title: Genetic Epidemiology
Volume 37, Issue 1
Publication date: 2013